Topic Labeling of Multilingual Broadcast News in the Informedia Digital Video Library

Authors

  • Alexander G. Hauptmann
  • Danny Lee
  • Paul E. Kennedy
Abstract

The Informedia Digital Video Library Project includes a multilingual component for retrieval of video documents in multiple languages and a topic-labeling component for English video documents. We now extend this capability to English topic labeling of foreign-language broadcast-news stories. News stories are coarsely machine-translated into English, then assigned to a topic category using a K-nearest-neighbor algorithm. In preliminary tests on Croatian television news, topic assignment based on the best available machine translation technology performed only 8% worse (on a standard F-measure) than topic assignment based on manual document translation. With a phrase-based MT module, the performance degradation was 31%.
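
The abstract names only a K-nearest-neighbor algorithm and an F-measure. As a rough illustration of that pipeline stage, the sketch below assigns a topic to an (already translated) story by a similarity-weighted vote among its k most similar labeled stories. Everything specific here is an assumption made for illustration and not taken from the paper: whitespace tokenization, cosine similarity over raw term counts, k=3, the tiny `TRAINING` set, and the reading of "8% worse" as a relative drop in F-measure.

```python
import math
from collections import Counter

# Hypothetical labeled English stories; not data from the paper.
TRAINING = [
    ("parliament votes on new election law", "politics"),
    ("president meets ministers to discuss election", "politics"),
    ("team wins football match in final minute", "sports"),
    ("coach praises players after football victory", "sports"),
    ("central bank raises interest rates", "economy"),
]

def bag_of_words(text):
    """Term counts from lowercase whitespace tokens (a simplifying assumption)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def knn_topic(story, labeled_stories, k=3):
    """Pick the topic with the largest summed similarity among the k nearest stories."""
    query = bag_of_words(story)
    scored = sorted(
        ((cosine(query, bag_of_words(text)), topic) for text, topic in labeled_stories),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for similarity, topic in scored[:k]:
        votes[topic] += similarity
    return votes.most_common(1)[0][0]

def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def relative_degradation(f_reference, f_system):
    """Fractional drop of f_system relative to f_reference (assumed interpretation)."""
    return (f_reference - f_system) / f_reference

# A coarse, MT-like rendering of a foreign-language story (made up for this sketch).
print(knn_topic("the minister discuss election law in parliament", TRAINING))
```

Under this reading, `relative_degradation` would be applied to the F-measure obtained from manual translations (the reference) and the F-measure obtained from machine translations (the system).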

Related resources

Story Segmentation and Detection of Commercials in Broadcast News Video

The Informedia Digital Library Project [Wactlar96] allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informedia digital video library. The success of the Informedia project hinges on two critical assumptions: that we can extract sufficiently accurate speech recognition transcripts from the broadcast audio and that we can seg...

Lessons for the Future from a Decade of Informedia Video Analysis Research

The overarching goal of the Informedia Digital Video Library project has been to achieve machine understanding of video media, including all aspects of search, retrieval, visualization and summarization in both contemporaneous and archival content collections. The base technology developed by the Informedia project combines speech, image and natural language understanding to automatically trans...

Informedia: News-on-Demand Multimedia Information Acquisition

In theory, speech recognition technology can make any spoken words in video or audio media subject to text indexing, search and retrieval. This article describes the News-on-Demand application created within the Informedia™ Digital Video Library project and discusses how speech recognition is used for transcript creation from video, time alignment of closed-captioned transcripts, a speech quer...

Accessing News Video Libraries through Dynamic Information Extraction, Summarization, and Visualization

The Informedia Project has developed and evaluated surrogates, summary interfaces, and visualizations for accessing a digital video library containing thousands of documents and terabytes of data. This paper begins with a review of Informedia surrogates for a single video document, including titles, storyboards, and skims. Incorporating textual elements, considering user context and emphasizing...

Publication date: 2004